1 CCTCGAAGTG CCAGGG4GCA CTQG^GGCCA CCCAGTCATC GGGGACACCT 
51 TCATCOCTCA CATCGCCGPG CTCQGOTPG AG\AGCGCTT CGTACCCAGC 
101 CAGCACTATG TCTACATUTT CCTGCTGAAA TGQCAQGACC 7CTCGCAGAA 
151 GGTGGTOAC CGGCGCTTCA CCGAGATCTA CGAGTTCCAT AAAACCTTAA 
201 AAGAAATCTT CCCTATTGAG GCAGQQQCGA TCAATCCAGA G4ACAGGATC 
251 ATCCCCCACC TCCCAGCTCC CAACTGGTTT GACGGGCAGC GGGCCGCCGA 
301 GAACCGCCAG GGCACACTTA CCGAGTACTG CAGCACGCTC ATCAGCCTCC 
351 CCACCAAGAT CTCCCGCTOT CCCCACCTCC TCGACTTCTT CAAGGTCCGC 
401 CCTGATCACC TCAAGCTCCC CACGQ\CMC CAGACAAAAA AGCCAGA&VC 
451 ATACTTCATG CCCAAAGATC GCAAGAGTAC CGCGACAG^C ATCACCGGCC 
501 CCATCATCCT GCAGACGTAC CGCGCCATTC CCAACTACGV GAAGACCTCG 
551 GGGPCCGAGA TGGCTCTGTC CACGGGGGAC GTOCTGGAGG TCCTAGAG4A 
601 GAGCGAG^GC GG1 PQGTOGT TCTX7TCAG4T GAAAGCAAAG CGAGGCTGGA 
651 TCCCAGCGTC CTTCCTCGAG CCCCTGQ\0\ GTCCTGACG* GACGG4AGAC 
701 CCTGAGCCCA ACTATGCAGG TGAGCCATAC GTCGCCATCA AGGCCTACAC 
751 TGCPGTGG^G GGGGACGAGG TGTCCCTGCT CGAGGOTGAA GCTXTTTGAGG 
801 TCATTCACAA GCTCCPGGAC GGCTGGAAAG ACGACGTCAC AGGCTACTTC 
851 CCGTCCATCT ACCTGCAAAA GTCAGGGCAA GACGTGTCCC AGGCCCAACG 
901 CCAGCTCAAG CGGGGGGCGC CGCCCCGCAG GTCCTCCATC CGCAACGCGC 
951 ACAGCATCCA CCAGCGGTCG CGGAAGCGCC TCAGCCAGG^ CGCCTATCGC 
1001 CGCAACAGCG TCCGTTTTCT GCAGCAGCGA CGCCGCCAGG CGCGGCCGGG 
1051 ACCGCAGAGC CCCGGGAGCC CGCTCGAGGA GG^GCGGCAG ACGCAGCGCT 
1101 CTAAACCGCA GCCGGCGCTG CCCCCGCGGC CGAGCGCCGA CCTCATCCTG 
1151 AACCGCTGCA GCGAGAGCAC CAAGCGG4AG CTGGCGTCTC CCGTCTGAGG 
1201 CTGGAGCGCA GTCCCCAGCT AGCGTCTCGG CCCTTGCCGC CCCGTGCCTG 
1251 TATATACGTG TTCTATAGAG CCTGGCGTCT GGACGCCGAG GGCAGCCCCG 
1301 ACCCCTXJTCC AGCGCGGCTC CCGCCACCCT CAATAAATGT TGCTTGGAGT 
1351 GGAAAAAAAA AAAAAAAAAA AAAAAAAAAA AA 
(SEQ ID NO: 1) 

FEATURES: 

5'UTR: 1 - 37 

Start Codon: 38 

Stop Codon: 1196 

3'UTR: 1199 
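
The feature coordinates above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function name coding_length is hypothetical and not part of the listing, and it assumes that the start and stop coordinates are 1-based positions of the first base of each codon.

```python
def coding_length(start_codon: int, stop_codon: int) -> tuple[int, int]:
    """Return (coding nucleotides, encoded residues) for a 1-based CDS.

    Assumption: both arguments give the position of the first base of the
    start codon and of the stop codon; the stop codon itself is not counted.
    """
    coding_nt = stop_codon - start_codon
    if coding_nt % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    return coding_nt, coding_nt // 3


# Coordinates taken from the FEATURES block: start codon 38, stop codon 1196.
nt, residues = coding_length(38, 1196)
print(nt, residues)  # 1158 386
```

Under these assumptions the open reading frame encodes 386 residues, and the stop codon occupies positions 1196-1198, which agrees with the 3'UTR starting at 1199.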



Homologous proteins: 

Top 10 BLAST Hits: 

Sequences producing significant alignments:                          Score   E
                                                                     (bits)  Value

CRA|18000004925255 /altid=gi|4557785 /def=ref|NP_000256.1| neut        789   0.0
CRA|18000005124568 /altid=gi|2754713 /def=gb|AAB95193.1| (U5783        788   0.0
CRA|18000005207006 /altid=gi|4263750 /def=gb|AAD15422.1| (AC004        783   0.0
CRA|18000005171728 /altid=gi|6685673 /def=sp|O77774|NCF1_BOVIN         684   0.0
CRA|148000004473069 /altid=gi|8439513 /def=dbj|BAA96544.1| (AB0        670   0.0
CRA|18000005118410 /altid=gi|9623382 /def=gb|AAF90134.1|AF2677         663   0.0
CRA|18000005141875 /altid=gi|3061282 /def=dbj|BAA25649.1| (AB00        659   0.0
CRA|18000005020732 /altid=gi|1171669 /def=sp|Q09014|NCF1_MOUSE         655   0.0
CRA|18000004937799 /altid=gi|2118398 /def=pir||I54525 leukemia-        651   0.0
CRA|148000001425618 /altid=gi|7839599 /def=gb|AAF70344.1| (AF26        527   e-148

EST:

Sequences producing significant alignments:                          Score   E
                                                                     (bits)  Value

gi|12896059 /dataset=dbest /taxon=960.                                1532   0.0
gi|12951967 /dataset=dbest /taxon=960.                                1501   0.0
gi|12342004 /dataset=dbest /taxon=96.                                 1423   0.0


EXPRESSION INFORMATION FOR MODULATORY USE: 
gi|12896059 placenta 
gi|12951967 B cells from Burkitt lymphoma 
gi|12342004 primary B-cells from tonsils 

Tissue Expression: 
Leukocyte 



FIGURE 1 



1 MGDFTFIRKLA LLGFEKRFVP SQHYVYMFLV KWQDLSEKW YRRFTEIYEF 
51 HKTIXB1FPI EAGMNPENR IIPHLPAPKW FDGQRAAENR QGTLTEYCST 
101 LMSLFTKISR CPHLLDFFKV RPDDLKLPTD NQTKKPETYL MPKDGKSTAT 
151 DITCPIILQr YRAIANYEKT SGSEW\LSTC DWEWEKSE SGWVFCQVKA 
201 KRGWIPASFL EPLDSPDETE DPEPNYAGEP YVAIKAYTAV EQDEVSLLEG 
251 EAVEVIHKLL DQWKDDVTGY FPSMYLQKSG CPVSCKJRQI KRGAPPRRSS 
301 IRNAHSIHQR SRKRLSQDAY RRNSVRFLQQ RRRQ^RPGPQ SPGSPLEEER 
351 QTQRSKPQPA VPPRPSADLI LNRCSESTKR KLASAV 
(SEQ ID NO: 2) 

FEATURES: 

Functional domains and key regions: 

[1] PDOC00001 PS00001 ASN_GLYCOSYLATION 

N-glycosylation site 

131-134 NQTK 



[2] PDOC00004 PS00004 CAMP_PHOSPHO_SITE 

cAMP- and cGMP-dependent protein kinase phosphorylation site 

Number of matches: 4 

1 42-45 RRFT 

2 297-300 RRSS 

3 313-316 KRLS 

4 321-324 RRNS 



[3] PDOC00005 PS00005 PKC_PHOSPHO_SITE 
Protein kinase C phosphorylation site 

Number of matches: 10 

1 36-38 SEK 

2 53-55 TLK 

3 133-135 TKK 

4 160-162 TYR 

5 300-302 SIR 

6 311-313 SRK 

7 324-326 SVR 

8 352-354 TQR 

9 377-379 STK 

10 378-380 TKR 



[4] PDOC00006 PS00006 CK2_PHOSPHO_SITE 
Casein kinase II phosphorylation site 

Number of matches: 11 

1 53-56 TLKE 

2 93-96 TLTE 

3 148-151 TATD 

4 171-174 SGSE 

5 178-181 STGD 

6 208-211 SFLE 

7 215-218 SPDE 

8 238-241 TAVE 

9 246-249 SLLE 

10 279-282 SGQD 

11 344-347 SPLE 

[5] PDOC00008 PS00008 MYRISTYL 
N-myristoylation site 

Number of matches: 3 

1 83-88 GQRAAE 

2 172-177 GSEMAL 

3 280-285 GQDVSQ 
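
The site lists above are the result of scanning the protein for short consensus patterns. As a minimal sketch of that kind of scan, the block below expresses four of the PROSITE patterns as regular expressions and applies them to a fragment of SEQ ID NO: 2 (residues 291-330, copied from the listing). The names PATTERNS, scan and FRAGMENT are hypothetical, and this is not the tool that produced the figure.

```python
import re

# PROSITE consensus patterns rewritten as regular expressions.
PATTERNS = {
    "ASN_GLYCOSYLATION": r"N[^P][ST][^P]",  # N-{P}-[ST]-{P}
    "CAMP_PHOSPHO_SITE": r"[RK]{2}.[ST]",   # [RK](2)-x-[ST]
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",       # [ST]-x-[RK]
    "CK2_PHOSPHO_SITE": r"[ST].{2}[DE]",    # [ST]-x(2)-[DE]
}


def scan(seq: str, pattern: str, offset: int = 1) -> list[tuple[int, int, str]]:
    """Return (start, end, motif) for every match, including overlapping ones.

    `offset` is the 1-based position of seq[0] in the full-length protein.
    A lookahead is used so that overlapping sites are all reported.
    """
    hits = []
    for m in re.finditer(f"(?=({pattern}))", seq):
        motif = m.group(1)
        start = m.start() + offset
        hits.append((start, start + len(motif) - 1, motif))
    return hits


# Residues 291-330 of SEQ ID NO: 2, as printed in the listing.
FRAGMENT = "KRGAPPRRSSIRNAHSIHQRSRKRLSQDAYRRNSVRFLQQ"

for name in ("CAMP_PHOSPHO_SITE", "PKC_PHOSPHO_SITE"):
    for start, end, motif in scan(FRAGMENT, PATTERNS[name], offset=291):
        print(name, f"{start}-{end}", motif)
```

On this fragment the scan reports the cAMP/cGMP-dependent kinase sites at 297-300, 313-316 and 321-324 and the protein kinase C sites at 300-302, 311-313 and 324-326, which match the corresponding entries listed above.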

Membrane spanning structure and domains: 

NO DATA 



BLAST Alignment to Top Hit: 

>CRA|18000004925255 /altid=gi|4557785 /def=ref|NP_000256.1| neutrophil 
cytosolic factor 1; Neutrophil cytosolic factor-1 (47kD); 
p47phox [Homo sapiens] /org=Homo sapiens /taxon=9606 
/dataset=nraa /length=390 
Length = 390 

Score = 789 bits (2015), Expect = 0.0 

Identities = 385/390 (98%), Positives = 386/390 (98%), Gaps = 4/390 (1%) 
Frame = +2 
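
The summary figures above can be reproduced from the raw counts. The sketch below is illustrative only: the helper names are hypothetical, and the use of integer truncation for the percentages is an assumption that happens to be consistent with the values printed in this alignment header.

```python
def reading_frame(query_start: int) -> int:
    """Forward-strand frame (1, 2 or 3) for a 1-based nucleotide start position."""
    return (query_start - 1) % 3 + 1


def truncated_percent(count: int, length: int) -> int:
    """Percentage with the fractional part dropped (assumed reporting style)."""
    return (100 * count) // length


# Values copied from the alignment header: the query starts at nucleotide 38
# and the alignment spans 390 columns.
print(reading_frame(38))            # 2, printed in the header as "+2"
print(truncated_percent(385, 390))  # 98 (identities)
print(truncated_percent(386, 390))  # 98 (positives)
print(truncated_percent(4, 390))    # 1  (gaps)
```

The four gap columns account for the difference between the 390-residue subject and the 386 residues encoded by the query.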

Query: 38 MGOTFIf^IALLGFEKRFVP 217 

MGETIT^RHIALLGFEKRFVPSQHYV^ 
Sbjct: 1 MGOTFIRHIALLGFEKRFWSCK^ 60 

Query: 218 eag^inpenriirhlpapkwftcqw^ 397 

EAG^NPENRIIPHLPAPKWFDGQRAAENRQGTL^ 
Sbjct: 61 EAGAINPENRIIPHLPAPKWFDGQfWV^^ 120 

Query: 398 RP!X)LXLPTI>XFKKPETY1^PI<^^ 577 

RPDDU<LPTONQTXKPEriTLMPI<Da<ST^ 
Sbjct: 121 RPDDU<LPTDN(jn<KPETYlJ^PKDGK^ 180 

Query: 578 DWEWEKSESQMa/FCQ^<AK^ 757 

CWEWEKSESQaAaFCQMKAKRG\/I PAS FL EPUDSPDETEDPEFNYAGEPWAIKAYTAV 
Sbjct: 181 DWENA/EKSESQrtiWFCQ^KAKRQ/^ 240 

Query: 758 EGDEVSLLEGEAVEVIHKLLDGW KDDVTGYFFSMYLQKSGQDV^^ 925 

EODEVSLLEGEAVEVIHKLLDGa/ KDDVTC^FPSMV^QKSGQPVSQ^QRQIKRG^ 
Sbjct: 241 E(I>EVSLLEGEAVEVIHKLIiXa^ 300 

Query: 926 RRSSIRNAHSIHQRSRKRLSQDAY^ 1105 

RRSSIRNAHSIHQRSRKRLSQDAYRRNSVRFLQQRRRQ^Rre 
Sbjct: 301 RRSSIRNAHSIHQRSRKRLSQDAYRRNSVRFLQC^ 360 

Query: 1106 PQPAVPPRPSADLILNRCSESTKRKLASAV 1195 

            PQPAVPPRPSADLILNRCSESTKRKLASAV 
Sbjct: 361  PQPAVPPRPSADLILNRCSESTKRKLASAV 390 (SEQ ID NO: 4) 

Hmmer search results (Pfam): 
HMM results: 

Scores for sequence family classification (score includes all domains): 
Model     Description                    Score    E-value   N 
CE00053   CE00053 mox_mitogenic_oxidase  573.6    1.3e-168  1 
PF00787   PX domain                      119.4    6.6e-32   1 
PF00018   SH3 domain                     107.5    2.7e-28   2 
CE00036   CE00036 androstane_receptor      0.3    4.6       1 

Parsed for domains: 
Model     Domain  seq-f  seq-t     hmm-f  hmm-t     score   E-value 
PF00787   1/1         4    121 ..      1    147 []  119.4   6.6e-32 
PF00018   1/2       159    213 ..      1     57 []   66.7   1e-16 
CE00036   1/1       243    257 ..    199    213 ..    0.3   4.6 
PF00018   2/2       229    279 ..      1     57 []   41.0   1.2e-09 
CE00053   1/1         1    386 []      1    566 []  573.6   1.3e-168 

1 TACTAAAAAT ACAAAATTAG CCAGGCGTGG TQGCGCACAC CTUTAATCCC 
51 AGCTACTTGG GAAGCTCAGG CAGGAGAATC GCTTCAACCT GGAAGGCAGA 
101 GGTTGCAGTG AGCCGAGATT GTOCCACTCC ACTCCAGCCT GGGCAACAAG 
151 AGCGAAACTT CGCTTCAAAC AAATAAATTA AGGCCCAGCA IGICI IGGCT 
201 TTCATCTGCC AGACCTCAAC CCTCACCCCC AGGAGATCAG GTCCGGACCA 
251 TGAGCTGACC CTCGACTCAG GCAAGGGTTGA GTPGGTGCAG CCCTGGCCTG 
301 CTGGGAGGCA CAGGCTGCAG CAGGCPGCCr GGGGCPGAGG CCCGCCACTC 
351 ATGAACTCAT GACOTCAAT GAGCTCCAAA AGCTCTGGGC CTCCCAGGCT 
401 CTAGGGGGAG TGGGAGAGAG AGGCCTCAGC CTCTCCCTGG GCATGCTGCC 
451 CCCTCCTCAC CTCTTTGTCC CAAATCCCCT TCCTGGCAAA GCTGACAGTC 
501 TTAATATCAC TCTGGAGAAA ACTGACTCAG CCCTAAGGAA CAATTCAATC 
551 AACCATTTGC TTACTTGAGG ATTGGAACTC AAGTCTCACT CAAAGTCTCT 
601 GCCATTTTCG TCCCAGCTCT CACTGGCCCT CATCCACACA CACCCAAGGA 
651 TCAGCATCTA ACGCTTGCAT GCACACTCCC ATCCCCGCGT TCATTCACTC 
701 ATTCATTCAT TCATTCACTC ATTCATTGAC TCATTCATTC ATTCACTCAC 
751 TCATTCATTC ACTCAGTCAA TGTTGCAGTC ACGATCCAAA TATTTATGGC 
801 CTCTGTUTGC CAGGCACTAG ATGGAGGGGC TGGGGCTAGA GCCCCTWA 
851 ACCCGCTCAT GCCCTAGGT TCCTGGGACA CACATTCTGG TAAGGGGAGA 
901 CTAAAAAAAT TAAGTCAGGC CAGGCACGGT GGCTCATGCC TGAATCCGAG 
951 CACTTTGGGA GGCCGAGGCG AGTGAATTAC CTCAGGTCAG GAGTTCAAGA 
1001 CCAGCCTOGC CAACATGGAG AAACCCAGTC TCTAATTAAA AAAA A AAAAA 
1051 AAATTAGCCA GGTGTGGTGG CACATGCCTG TAATCCCAGC TACTCAGGAG 
1101 ACTAACGCAA GAGAATTCCT TGAACCCAGG AGGCAGAGCT TGCGGTGAGC 
1151 CGAGATCGCG CCATTGCACT CCAGCCTGGG AAACAAGAGC GAGACTCCAT 
1201 CTCAAAAAAA AAAAAAGPGG GAGGCAGAGG CAGGAGGATC ACTAGAGGCC 
1251 AGTAGTTTGA GACCATCCTG GGCAACATAG CAGGACCCTG TCTCTACAAA 
1301 AAAATTAAAA AAAATTTAAC CGGGCATGCT GGCACACACC CCTACTCCCA 
1351 GCTACTCCAG AGGCTGAGGC AGGAGGATCG CTGGAGCCCA GGAGTTGGAG 
1401 GCPGCACTGA ACTGTGATCC CACCACTGCA CTTAAGCCTG GATAACAAAG 
1451 GAAGACCCTG TCTCAAATAA CAATAGCAAT AATAATAAAG AAAAATTAAA 
1501 TGCAATTTGC GATGCATCAG TGATAAGTGC TCTGCAGAAA AAGGAGGCAG 
1551 GAAGAGGCTG AGAAAGGTAT GAGGTTTGCT ATGCAATGTG AAGTTATCAA 
1601 GGAAGGCTTC TCGGAAGAGG TGACATTTCA GCAGAGAAAT GGAGGAGAGT 
1651 TATCGAGGGA AGATGOTGAA TGGGGGGAAC ATGGTCAAGA CCAGGAATAT 
1701 GGTCAAGGGG GGAAAGATGG TCAAGGGGAC GCAGCAAATG CAAAGGCCCT 
1751 GAGGCAGGAG CAGCTTGATT CACCCCCAAA ACCCGTGGGG CCCCTGCAGG 
1801 CGACGGGAAG GACAAGTGTA AACCCTTTTC CTTGTCCCTG CAGGTGTGTG 
1851 TGAACATGAG TCTGCCCATG TTTACACCCT GCAAGCCTGA AGAGTCCCCA 
1901 GAAACTGAAA GAAGAAGCAA AGCCCTTTCT CTACCCTCCC TGCCCCCTGT 
1951 CCCGACCGCG ACAAAAGCGA CTTCCTCTTT CCAGTGCATT TAAGGCGCAG 
2001 CCTGGAAGPG CCAGGG^GCA CTGGAGGCCA CCCAGTCATG GGGGACACCT 
2051 TCATCCGTCA CATCGCCCTG CTGGGCTTTG AGAAGCGCTT CCTACCCAGC 
2101 CAGCACTATG TGAGTAGCTG GTGGAGGGGA TCCCGTGGGG GGAATACGGG 
2151 AGGGACAGCA CGGCCACCCT TGCAGTCCGA GGGCCAACCA GCTCCAGTGA 
2201 GGACTAACGG GGCAGGGTCT TGGGCACCTG GTCCCTCCTC TTTGAGCCTG 
2251 GATCTACCCC TCTGATCCCT GGGAAGACAG TTCCCTTGGA CCCGCCCTGG 
2301 GCCCCAGCCC TTTACTGTCC CCGCCTGTCT CCCCAGCCAG GCCCTCAGCC 
2351 TTAGCCAGGA GTCCTCTTTC TGCTCCCCTG CCATGGCCAG GCAGCCCAGC 
2401 GCTCTCrCAG GTCCGAGGCC CACTCCTCCA GGAAGCCTTC CCTGACTAGC 
2451 CCAGCTATGA GAGAGTGGCC CTCCCAAGAG GGAGGCCTGG AAACTAAAGC 
2501 TCTCTCTCTC CCCAGCTGCC TGTAGTGTCA GTTAGAGTCT TATCCTCTCC 
2551 AGTAGGGTGA CACCATCACA GGGGCCAATA GACTCCTCCC ATCTGTCCCC 
2601 AAGGAGGCTG GACAAATCCC TCCTCAGACA CACAAGTCCA CTGGCTCCCC 
2651 TAATCCCATA GGAAGGCCAG GGAGGAACTA CATTTAGGAA ATTGAAGCTT 
2701 GTATGGAACA TTTAGTCCTA TGTGCCAAGA CCTTTCTCTT I I I IGI I ATT 
2751 Illl IGTGTT TTGAGACAGA GTCTTGATCT GTTGCCCAGG CCAGACTGCA 
2801 GTGGCACGAT CTCAGCTCAC TGCAACCTCC GCCTTCCAGG TTCAACTGGT 
2851 TCTCCTGCCT CAGCCTCCAG AGTAGTTGGG ATTACAGGTG CCCACCACCA 
2901 CGCCTGGCTA ATTTTTCTAT TTTTAGTAGA GACAGGGTTT CACCATGTTG 
2951 GCCAGACTGG TCTCAAACTC CTGACCTCAA GTWCCACC CACCTGGGCC 
3001 TCCCAAACTG CTGGGATTAC AGGCATCAGC CACCGTGCCT GGCCTGI I I I 
3051 TTTGAAATGA GGTCTGGAGT GCACTGGTGC GATCATAGTT CACTGCAGCC 
3101 TCAACCTCCC AGGCCCAAGT GATCCTCCTG CCTCAGCCCC TTGACTAGCT 
3151 GGGGCTACAG GCGCACACCA CCATGCCTGG CTAGMMIA AAA 1 1 I I IGT 
3201 GGAGATGAGG TTTCACTATC TTCTCCAGGC TAATCTTG4A CTCCTCGGCT 
3251 TAAGCAACCC TCPQOTCTCA GCCTCCCACA GTGCTAGGAT TACAAGCGTG 
3301 AGCTACCGTC CCTAGTCACT TTTCTCCTTT TLI I IGIAAC TTTCAGTTTT 
3351 GAAATTTCAA ATTTACAGAA AGGCTACTCG GTCTCAAAAC GGTACCAGTC 
3401 ACTCCAATAG TCTTTCACTC ACCTTCATCC ACACCTCTCT TTCTGGGGAT 
3451 Af I I rCTCAA TTA7TTGAGA GTGAGT7GAA GACGIGIIIC TTTACCTCTA 
3501 AATACTAGTT GTTGGGCATT TCTTAAAATC AAGGCATTCT CTTACATAAT 
3551 CACAAGACAC GTCTCAAAAT CAGGAAATTA ACATGGACAA AACACCATTA 
3601 TCCACCCACA GAOTTACre AGGTTTCCCC GATTATCCTG CTTGTCCTCT 
3651 GCAGTGAAAA CI I 1 1 1 IGAG GTCTAGGATC CAGTTCAAGGA TCAATTjTCAT 
3701 AGCCTTTAAC CTTCTTTAAT CTGGATCAGT CI 1 1 I I ICi I I I ICI I 1 1 IC 
3751 IIIIIIIGGA CACGGAATCT CACTCTGTCG CCAGACTGGA GTGCAGTGGT 
3801 GCAATCTCGG CTCATTGCAA CCTCTGCCTC CTCGGTTCAA GAGATTCTCC 
3851 TGCCTCAGCC TCCTCAGTAG CTGGGAATAC AGGTGCGCGC CACCACGCCC 
3901 AGCTCGTTTT TGGTAGAGAC AGGGTTTTGC CATTGATTCT GGATCAGTCT 
3951 llllllllll TTTTATCAGA TGGAGTCTTA CTCTGTCACC CAGGCTGGAG 
4001 TGCAATGGCA CAATCTCCAC TCACTGCATC CTCCGCCTCC CAGGTTCAAG 
4051 CAATTCTCGT GCCTCAGCCT CCCGAGTAGC 7GGGATTACA GGCATGCGCC 
4101 ACCATCCCCG GCTACI I I 1 1 GTAI I I i I AG TAGAGACAGG GTTTCACCAT 
4151 GTTAGCCAGG CTWCTCGA ACTCGPGACG TCAGGTGATC TGCCCGCCTC 
4201 GACCTCCCAA AGTGCTGGGA TTACAGGCGT GAGCCACCGT GCCAGCGGAT 
4251 TCTGGATCGG TCTTAATCAG TCTTTCTCTT TTGCAACTTT GATGTTTTGC 
4301 AGAGAGCAGA CCAG1TACCT TCTAGAATCT CCCTTAGTTT GGGTTTATCT 
4351 TCATTAGATT CAGTTTGTGT ATCCAGGGCA GTGGATCTTA GATGCAATTC 
4401 TGTCl ICI I I TTAAI I I I I I TGAGAGGGAG TCTCGCTCTG TCACCCAGGC 
4451 TGGAGTGCAG TGGCACAACC TCAGCTGACT GCAGCCTCCG CCTCCCGGGT 
4501 TCAAGCAATT CTCCTGTCCC AGCCTCCCAA GTAGCTGGGA TCACAGGTGC 
4551 CCATCACCAC TACCGGGTAA [III IGTCTT TTTAGTAGAG ACAGGGTTTC 
4601 ACCATATTGG TCAGGCTGGT CTTGAACGCC TGACCTCAGG. TGATCCACCT 
4651 GCCTTGGCCT CCCAAAGTGC 7GGGATTACA GACGGGAGCC AACATGCCCA 
4701 GCCTTCCTGC CCCTCCCGTC CCCTCCCCTC TCCTCCTGTC CCCTCCCTTC 
4751 CCCTCCCCTA TCCTCATCTC CCCTCCCTTC CCCTCCCCTC CCCACCCAAG 
4801 CTGGAGTGCA GTGGTGGAAT CATAGCTCAC TAAAGCCTTG ACCTCCAAGT 
4851 CTCAAGCAAT TCTCCTGCCT CACCTGGGGC CACAGGTGTG CGGCACCACA 
4901 CCCGGACAAT TTTTCTGTTT TTAGTAGATA TGGGGGTCTC GCTATGTTGC 
4951 CCAGGCTGGT CTCAAACTCT TGGACTCAAG CGATCTTCCC ACCTCGGTAC 
5001 TAAAAAGTGC TGGGATTCCA GGTGTGAGCC ACCGTGCCOV GCCTAGGTCC 
5051 TACTTTTATC TCCAATTTAC AGATGAGTTCC ATTTGAGAGA AGCTGACCCT 
5101 CTTGCCCTGG GTCTCAAGGC TGGGGCGTGG CAGCACTTGG GTCCACGTTT 
5151 GTGCCCTTTC TGCAATCCAG GACAACTGGA AAGATGGTCC TCACCCCAAT 
5201 CCTCTGGGCT TCCTCCAGTG GGTAGTGGGA TCCTGGGTGC ACACAGCAAA 
5251 GCCTCTTTGG AGGCTGAATG GGGTCCCCCG ACTCTGGCTT TCCCCCAGGT 
5301 ACATGTTCCT GGTGAAATCG G\GGACCTGT CGGAGAAGGT GGTCTACCGG 
5351 CGCTTCACCG AGATCTACGA GTTCCATGTG AGTCTGGGGA CGGAGGAGGG 
5401 AO\GGGACCC ACCGTTCCAG CTCCACCCTT TGGGAAGGAC CTTAGCCCAG 
5451 GTGATGGGGA AACTGCAGAA CCCAGAATCC CCTCCCAGAC CACAGTTAM 
5501 GGGGATTTAT TTATTTATAT AAA I I I I IGT GACAGGGTCT TGCTCTGTCA 
5551 CCCAGGGTCT TGCTCTGTCA CCACTCTGAA CACCTCATGT TCTCTGATTA 
5601 CAGGCATGAG CCCCCACGGT CGGCCTTTTA GGTGGTTTTG AGAGGTATTT 
5651 AGGTTTGCAG TGCAGGGGCG CAATCATAGC TCACTGCAGC CTCAACCTCT 
5701 GGGGCTCAAG CGATCCTCCT GCCTCAGCCT CCTGAGTAGC TGGGACTATA 
5751 GGTGCGCATC ACCATGTGTG GCTAAI I I 1 1 GTAI I I I I IA TAAAGATGGG 
5801 GATCTCACTA TGTTCCCCAG GCTGGTCTTG AACTCCAGAC CTCAAGTGAT 
5851 CCTCCTGCCr TGGCCTCCCA AAGCTAAGGG GGCATTAAAA GAAAAAAACA 
5901 TTTTTCCCCC TGAAACATTT AAGTAGTCTT ACTGAAAACA ATAAAACACA 
5951 GAAACACCAG ATTCTCATTT TAAAGTAAAA CAGACAGGAT CTCCCAGAAC 
6001 CTTCCTAGAA TGGAACCATT CTTGTCGCTT TTGAAAAACA AAGCCAAGTT 
6051 CTAGATCCCA AATAAATGCA CCTGCTGGTG AACATTCTCC TTGTGGTTCT 
6101 CGTCCCTATG TTAGTTATTT TCCTAAATTT TACATTTCTA CCTTTTTAAG 
6151 AATGAGTTAT CAGI I I I I 1 I ATATTTGCTT TTCTTTTGAG ATGGGGTCTT 
6201 GCTCTGTCAC CCAGGCTGGG GTGCAGTGGT GCAATCACGG CTCACTGCAG 
6251 CCTCAACCTC CAGGGCTGAA GCGATTCTCC CATCTCAGCC TCCCATGTTG 
6301 AGATCACAGG TGTGCACCAC CACACCTGGC TCCTTTTCCT GAI I IGI I I I 
6351 TTGTAGAGAT GGGATTTCGC TATCTTGCCC AGGCTGGTCT CTAACTCCTG 
6401 GACTCAAGTG ATCCTCCCGC CTCAGCTTCC CAAATTGCTA GGATTACAGG 
6451 TTTGAGCCCC TGCACCTGGT CAACCTGAGT TTTAAGAGGA TCCCTTTGGC 



FIGURE 3B 



6501 GACPGQVTTG AGG4CAO\CA AGAGTOGACG GGGG^CACAA GGAGGCCATT 
6551 TTCGTTATCC AGGCCTQGTA GTOQCTAGOG CCAGGAGGCT GGGCTTGGTG 

6601 ggaagcaotc agatcccaaa g^gatttcqg gattcgaagc aaaagg4ttt 
6651 gctggtc^ct tccacatcgg aggg^gagag gtcagtgcct ctcttaatca 
6701 agg4atccag attgccaccg aaatttctag gcccgagata tttaggtagt 
6751 gtctcacrct gtcacccagg atggagtgca gtggcgccat ctcggctcac 
6801 tgtaacctcc gccpcccagg tttaaacgax tctcccacct cagcctcctg 
6851 agtagctggg attacaggca tctgccacca ctcccggcta al 1 1 1 igtat 
6901 ttttagtaga gacggggttt caccacgttg gccaggctgg tcttcaactc 
6951 ctcacctcaa gtgatccacc cacgacagcc tcccamgtg ctgggattac 
7001 aggcgt^\gc caccatgctc ggcotttag gtggtttto\ gagotattta 
7051 ggtcacttcc aatctcgtgc ttttccaagt gttctaaact acaaatattc 
7101 cttcacctct tli igili ii ttaatcttta gaaaacctta aaagwvtct 
7151 tccctatpga ggcaggggcg atcaatccag aqvxcaggat catcccccac 
7201 ctcccaggtg agcacggggc tcagccgcxt gtcagggggt cattggcggg 
7251 ggctcacctg ccctcccagc acctctcggg cttgacctca tgttctcnqg 
7301 tgccagctcc caagtggttt cacgggcagc gggccgccga g^accaccag 

7351 GGCACACTTA CCGAGTACTG CAGCACGCTC ATQ\GCCTGC CCACCAAGAT 

7401 crcccGcrcr ccccacctcc ttg^otctt caaggtccgc cctgatgacc 

7451 TCAAGCTCCC CACGGACAAC CAGTGAGPGA ACTTTTCACC CTGCCAGGTG 
7501 GGAGAGGGAA GCAGGGGTGG GACTTTCTCT GTTTTGCACA TG4GGAAACC 
7551 AAGGCTG\GA GAGGGAAAGC CACCTTCCCA GAGCCACACA GCCAGAAAGA 
7601 GGAGGCAAAT TCCACCTCCG GCCCCTGTGA CCCGGCCAAG CCTCCACCTT 
7651 AATCTTTCAC ACCTCAGGGC ACTGGGGGAA GCACTCGGGG CTGGAGGTTC 
7701 AAAGTCCTGG GTCCTCATCC TGACATTATG GCCACCTGGC CATGGGACCT 
7751 GGAGCONGTC ACCACTGCTC TCTGAATGCA GGTTCTCCAT TTCTATAATG 
7801 GGCAGKyXGG ATCAGATCAA GCATTGGGTG TCTTGCGGAG CCCCCCAGAA 
7851 GGATGTGGGG TTG^TGCCTC TGCTAAGTGC TGAGCATGTC TGGGGTCTCC 
7901 TGTACCCAGG ACCCTGTTCTG QVAGGCACCT GAG^GGCPCA GGGAGCTCCA 
7951 GGCAGGCTGG GGAAGTCCCC TTCTCCACTC CTCTCTGGTC ACTGAAGCTC 
8001 GAAGTGGGGA GCATGAGGAC AGGACGTTAC CCC7TGTCAA GGCACCCAGG 
8051 CPGCCAAGAC AGAGACAAGC AGCATTGCTC CGGCCAGCAC TTATTGACGC 
8101 TTGAAGGTCT CCCCTGGCCC AAGGAAGGGC AGTTATCATC AGCCCGGGAG 
8151 GCGGGGGAAG GATGGACTCT GCAGTGGGGT CCGCTCCTCA TTGCCTGCTC 
8201 TCTCAGGGCr CCAG4AGGAG CAAGAGGCCG GGCACAGPGG CTCACACCTA 
8251 TAATCCCAGC ACFTTGGAAG GTCGAGGTOG GCAGATCACC TGAGGTTGGG 
8301 AGTTTCAGAC CAGCCTGGCC AACATGGTGA AACCCCATCT CTACCAAAAA 
8351 TATAAAAATT TAGTCAGGCA TCGTGGPOTG CGCTTGTAAT CCCAGCTACT 
8401 TGGG4GGCCG AGGCAGGAGA ATCGCTTG4A CCCGGGAGGC AGAGGTTGCA 
8451 GTGAGCTGAG ACTGCGCCAC TGCACTCCAG CCTGGGTG^C AGAGCG^GAC 
8501 TCTGTCTAAG AAAAAAAAAA G4AAAGAAGA AAG4AGATGG CCTGGGAGCC 
8551 CGCAAOAGCA TTTTCCAGGC TTAGGGCATC CTTTGGGTCT GCAGAAGGCT 
8601 ATGCAGTGTC CTCCTCATOT CCCTCCCTTG GGCTGCCCGA GCAGATCCGC 
8651 CCGCCCCCAT OVCTTCQIGA AGCCC7TCCT CAGCCAGTCC AGTTCCTGTC 
8701 TTCrCKICGC ACTGCCCCTT CCCTTTCCCG GOTCCCTCTT CTCTTIGGGAA 
8751 GTTCTTCTGC AGGTCTACCC AGTGCCTCTT CTTCCTCCAT GGGAAGCCAA 
8801 GGGTCTCACC CAGACTQTC TCTCCTCAGG ACAAAAAAGC CAGAGACATA 
8851 CTTGATGCCC AAAGATGGCA AGAGTACCGC GACAGGTGAG AGG4CGGGGG 
8901 GG\GCCGGCG GGGGGGG^CA CCCTCAGGAG ACCCAGAGTO TTCAGGG^AT 
8951 GGAGCAGGGG CTGGGAGCAG GCTGGGAGGG CTCACAGCTA CCCTGCTGAA 
9001 GAATTGGCTC TTTGGGCCGG GFGCGGTTGC TCATGCCTCT AATCCCAGCA 
9051 GTTTGGGAGG CCGAGGCAGG TGG4TCACTT GAGGTCAGGA GTTTGAG^CC 
9101 AGCCTGGCCA ACATCCAGAA ACCCTGTCTC TACTAAAAAT CCAAATTAGC 
9151 CAGGCGTGGT GACAGGTGCC TCTAGTCCCA GCCACTTGGG AGGCTGAGGC 
9201 AGGAGAATTG CTTC^ACCCG GAAGACGGAG TTTGCAGTOA GCCGAGATCG 
9251 TGCCACTGCA CTCCAGCCTG GGCAGCAGAG CCAG^CTCCA TCTOWVWA 
9301 AAAAAAAAAA AAGAAGAATT GGGTCTTTCG AAGGTCCCTC Gfi&kCTGMA 
9351 GGAGCCCTTT GCAGGTGGCA GTGCAGAG^C CAGCGCAGAC CCTTGCTACT 
9401 GGCAGCCGGG GGAGTCTTTG CGGCTG4ATG AATG4ACAGG TTTTGGAGGG 
9451 CAGCGTGGCC TTCAGAGGCG ATGCAGGGCT GTGGCAGTTT CTAATACTTA 
9501 TTGCACAGTC ACTGCTAATA ACAATAATAA TAATAATACC TAACATTAAT 
9551 GGAGTGCTTA CTCTUTGCCA GCCACTATTT TGI I I I IGl I GTTTTCAGTG 
9601 ACAGGGTCTC GCTCTCTTGC CCAGGCCAGA GTGAAGTGGT GTGATCATAG 
9651 CTCACTACAG CCTG&\CCTC CTGGGCTGAA GCGATCCTCC CACCTCAGCC 
9701 TCCCAAGTAG CTGGGATTAC AGGTCTGTGC CACCATCTCC AGCTAATTTT 
9751 TAATTTTCTG ATAGAGATCG GGTCTCACTA CATTGCCCAG GCTCOTCTTA 
9801 AQCTCTTQQC CTCAAGCAAC CCTCCTOCCT CAGCCTCCCA AAGTGCTGAG 
9851 ATTATAGACA 7GAGCCACTG TCCCCGGCTT TTTCTTCTTC TTATAAGGAC 
9901 ACGAGGCCTG TTGGGTTAGG GCCCACTOA CTGACCTCAT TTTAACTTAA 
9951 TTACCTCTTG AAAGCTACTT AAGAGTACCT TTCTCTTAAT ACACCCACAC 
10001 TCTAAGGTAC TGGGTGG1 IA GGACTTCAAC ATATGAA7TT TGAGAAGGCG 
10051 GATCTCAGCC AATACCAAAC AGCATCAGCA CCTCCACGGT TGGATGAAGG 
10101 GCTGGTCAGA AATGCACACT CAGGTCCCAC AGTGGACCTA CTCAACAGGA 
10151 TAGGCATTTT AGCAAAATCC CAGOTATTCG GSTCCACOT AAAG1TAGGA 
10201 AAAGGTCAGG CACTGTGGCT CATGCCTOTA ATCCCAGCAC TTTGGGAGGC 
10251 CGAGGCGGTT GAATCACCTG AGGTCAGGAG TTCGAGACCA GCCTGACCAA 
10301 TATCGTGAAA CTCCATCTCT ACTAAAAATA CAAAAATTAG CCAGGTUTGG 
10351 TGGCGGGTGC TTCTAGTCCC AGOACTTGG GAGGCTGAGG CAGGTGAATT 
10401 ACTTGAACCT GGGAGGTGGA GGTTGCAATG AGCCAAGATT GCACCACTGC 
10451 ACTCCAGPGA CAGAGCGAGA CTCCATCTCA AAAAAAAAAA AAAAAAAAGT 
10501 TGGGAAAAGG CGAGGTGCAG TGGCTCCACG CCTGTAATCC CAACACTTTA 
10551 AGAGGCTGAG GTGGGAGAAT COTTCAGCC CAGGAGTTGG AGACCAGCCT 
10601 GGGCATTCTC COAAGACCTT GTCTTTAGAA AAAATTAGCC GGGTCTGGTG 
10651 GCATACGTCT GTGGTCCCAG CTATTCGGGA GGCTCAGGCA GGGAGATTGC 
10701 TTGAGCCTAG GAGTOAGGG CTCTAGTCAG CTCTCATCAC GTCACTCTAC 
10751 TCTAGCCTCG GCAACAGAGC AAGACTCTCT CTCCAAAAAA GAAAATAAAG 
10801 TTGGGAAAGG CTCACTAACT TGATCAGATG AGAAOVAAGA CATGTTTGAA 
10851 GTGTGAGGCC GAAGCCTGGA GAACGCTATG CGCCCAGGAA ATGCAGGGGA 
10901 GCAGAGACTC AAGATGCCAG CGCCTGTTCT GGAGGCCCAG ATGGGCCCTG 
10951 CAATCCCCAC TCACCCTGCC CTCCCTCTTG CCCCAGACAT CACCGGCCCC 
11001 ATCATCCTGC AGACGTACCG CGCCATTGCC GACTACGAGA AGACCTCGGG 
11051 CTCCGAGATG GCTCTCTCCA CGGGGGACGT GGTGGAGGTC GPGGAGAAGA 
11101 GCGAGAGCGG TCAGACCTCC CACCTTACGG GGCTCCTTCC CCTGGTGCTC 
11151 AGGAACCCAC AGCCACAAGC CCCCTGCCAA GGCTCAGGCA GCCTGGCCCC 
11201 TQGGAGGACT CCAGCTCTGT TAGGGGCCCT AAATGTCCTC * CCCACACTGT 
11251 GGGTCGCCTT CTCTCTTAGT GTGCACCCTG TGGTGGCTGT GGGCATCTGT 
11301 GCATGGCAGG CCGGGGCGGG GCATGTOGC GTCTTCTGTC TGGATCGGTA 
11351 TGGGACCGTC TGTTCATTAT GAAGTGGGCT CAGAGCTGTG ATTCTGTGAG 
11401 CATGTGTGCA TGCATGCATG TGACCTCATT GTCCAGTGTG GTGAAGGTGA 
11451 CATTTCCAAA TCTCAGCATT GGACATCAGT GTGTCTCTGT CCCTCTUTCC 
11501 TCACCATCCC TGATGGCTGC AGGGAGCCGC TGGGCCCTGC CCCTGAGTCA 
11551 CATTCCCGCA CCTCTGGCAC AGGTTQGTGG TTCTGTCAGA TCAAAGCAAA 
11601 GCGAGGCTGG ATCCCAGCAT CCTTCCTOGA GCCCCTGGAC AGTCCTGACG 
11651 AGACGGAAGA CCCTGAGCCC AACTATGCAG GTGCCCCCTG CCCTCCGAGG 
11701 CTCTAGGGGT GTGGGAGAAA GGGGCAGGCA GGGCTCAGGG ATATTGAGTG 
11751 ACTGCTTTGG AGTCTGGGCT GGTTGCTGGC TTGGCAGAAA AGTCAGGGCT 
11801 AAGATCTCAT CGGCTCTGGC TTGGGGGCCC TGGCAGGTTG TGATGCCCTT 
11851 GGTCTGGACA GGGAACCAGG AGGAGGAGCA GACGACTCGG GAGAGTGGGA 
11901 GGCCAGTGGT GTCTCTCGAT ATGTCGCCAG GTTCAGTGGG AAGCTGAAGG 
11951 ATGAGCAGAC CTTAGGCTCA GGAAGGAGGG CTGCCTGGAA GTCGGGGCAT 
12001 CATCACTGAC CAGAAAGGGA AAACTGGCAG TGCCAGGGCT GGATGGGGCC 
12051 TGCATTGAGC TTGAAAAAAA CTATAATAGA ATTGGTTACC ATTTTATTTT 
12101 ATTATTTATT TATTTATTTT AC! I 1 1 I IGA GATAGAGTCT CACTCCCTTG 
12151 CTAAGGCTGG AGTGCGGTGG TGOATCTCA GCTCACTGCA ACCTCTGCCT 
12201 CCCAGGATCA AGTGATTCTC CAGCCTCAGC CTCCCCAGGT AGCTGGGATT 
12251 ACAAGCATGC ACCACCATGC CTGGATAATT TTTOTATTTT TAGTTGAGAC 
12301 GGGGTTTCAC CAGGTTGGCC AGACTGCTCT CGAACTTCTG ACCTCAGGTG 
12351 ATC7GCCTGC CTCGGCCTCC CAAAGTGCTG GAATTACAGA TCTGAGCCAC 
12401 TGTCCCTCGC CTGGTTACCC ACATTTTAAA ATGGAGTGAT TTCACCCTTT 
12451 TATGTGGATT TACAGCTTGT 1 1 I I II I 1 1 I TTTTTGAGAC AAAGTCTGGC 
12501 TCTOTCACCC AGGCJGGAOT GCAGTAATGC AATCTCAGCT CACTGCAACC 
12551 TTAGCCTCCr GGCTTCAAGC AATTCTCCTG CCTCAGCCAC CTGAOTAGCC 
12601 TGGGGTTACA GGCATGCACC ACCACGCCAG GCTAAI I I I I TCTAI 1 1 I IA 
12651 GTAGAGATGG GGTTTCGCCA TGTTGGCCAG GCPOGTCTCG AACTCCTGAC 
12701 CJCAGGTGAT CCGCCCGCCT TGGCCTCCCA AAGTGCJAGG ATTACAGCTG 
12751 GGAACCACCT TGCCCAGCCT GTGGCTATCG TTTAAACACT GGGAAGGCCT 
12801 GCAGCCCCCA GGCOGACAGT TAGCTGCAGC TCAGCAGTTC CCAGTGCCAG 
12851 GTAGACGGAT GCTCCACCCA CCTACTCATG GCTGATCTCT TCTCATAGTG 
12901 AAGTGTOGG ACAGACCTTC ATCGTTATGG GATCTCTGGT CCCCAGAGTG 
12951 GGTGGCAATG AATCGGAGTG GACAAGCTCA CCTGGGTGTA GGGGGCAGAG 
13001 GGCCGAAGTC o^gagtctac ccccagagtg ggtgcgagca ggagcttccc 

13051 GAGGGATCTG GGA7GGAGCA GGAGGGTCGA GGGAGGAGAC CCAGAAGAGG 
13101 GGGAACTCTG GGCCCTGGCT GGGTCPGGAG 7GCCTGGAGG MGCCCAGGC 
13151 GCAGAGAGGA GAAGATGGGA TCGGTGGCGA GCCCCAGGCT GGGCCGACCT 
13201 CACACTCTTQC TCTUTCCCCC TCCCGTGGAC CAGGTGAGCC ATACGTCGCC 
13251 ATCAAQGCCT ACACTQCTGT GGAGGGGGAC GAGGTCTCCC TGCTCGAGGG 
13301 TGAAGCTUTT GVGCTAATTC ACAAGCTCCT GGACQQCTQG TGGGTCATCA 
13351 GGTAQGAQQG CCCCTCPCCA TCCAGAGCAC CCATCTGACT CAGCCCCAGC 
13401 CAGGACGGGG TCTTTAGGGA TCPQGGOTGA CT7CTCCCTG GGACTCTGGG 
13451 TAAGCCACTG CCCOPCTCTC QQGTAGTTT CCATCTCACT AGCAGGGAGG 
13501 GATGAGCCCft CCCTTCCCTG TCTTCTGGGG ATCCAATGTC CTTGTCCAAG 
13551 TGGGTGCATT TCTCL I 1 1 G I GATTTAGGCT CTCTTCCCAA CCATCTATTA 
13601 TTATTCOTC TCTGGCAACA TGGTGAACTG TTCTATAAAT AATTACATTC 
13651 CTAGCTAGGC GCAATCGCCC AGGCCTCTAA TCCCAGCACT T7GGGAGCCC 
13701 AGGACAGGAC GATCACGTGA GCTCAGGACT TCGAGACCAC CCTGGCCAAC 
13751 ATGGCAAAAC CCTATCTCTA CTAAAAACAC AAACATCAGC CGGGTX7TTCT 
13801 GGTCGGAGCC TCTAATCCCA GCTACTCGGG ACTCTGAGAC AAGAGAATGA 
13851 CTTCAACCCG GGAGGCGGAG GTPGCAGTCA GCCAAGATOG CGCCATTGCA 
13901 CTCCAGCCTG GGCAACGAGA GCGAAACTCC CTCTCAAAAA AAAAAAAAA A 
13951 AAAAAAGATT AU I ICI I I I TATCATTCCT TTATCTTTTA AAGCI I ICI I 
14001 GCAOTCAGCT GCAGTCTCTC ATGCCTCTAA TCCCAACACT TTGGGAAGCT 
14051 GAGCTGGGAG GATCACTCAA GGCTACAAGT TCAAGACCAA CCTGGGCAAT 
14101 CTAGGGAGAC CTCTCTCTCT ACAAAAAAAA TTAAAAAATA GCTGGATCTG 
14151 CTAGCACACA CCTCTAGCCC CAGCTACTCA GGAGGCTGAG CTGAAAGGAT 
14201 OKTTGACCC CAGCAGTTGG AGGCAGCAGT GAGCTATGAC TGCACCACTG 
14251 CACCCCAGCC TGGCTGATGG AGCAAGACCC TGTCTCAAAA AAAAAAAAAA 
14301 AAAAAAAGCT TCCATTGCAA TTCCCATCTG TTTATCCTCC AAATGAATGC 
14351 AGAAATACTA ATTATCTTTT TTCTGGTTCT GGGGAACACA GAA7TCTAGC 
14401 GGCTTCTGGA GCCATT TCCC TGGAGCCATG GGGCCTCCCA GCTCCTTTCC 
14451 TGIGICI ICA I 1 1 1 1 IACGA Al I I I I I CAT I 1 1 I IGAGAC AGGATCTTGC 
14501 TCTGACTCCC AGGCTGGAGC ACAATCATCG CTCACTCAAG CGATCCTCCC 
14551 ACCTCAGGCr CCCACCTAGC TGGGACTACA GGTGAGCACC ACCACATCTG 
14601 GCTAATCTTT TTTAATTTTT TTCTAGGGCT GGGCTCTCAC TATGGTGCCA 
14651 AGACTACTCT TAAACTCCTG GCCTCAAGAG TTCCTCCTGC CTTGGCCTCC 
14701 CAAAGCACTG GGATTACAGG AATGAGCCTC CATGCTGGGC CTTTGCTGGC 
14751 CTCTTOGAG CCCTAGCTCA CAGGGCCAGC CTGGCGCCCT GCCGCAAGCT 
14801 TATCTTAAAG CTGGGACCAC AACATGCATA CCTGCAGCCG GGCCCGGGGC 
14851 CAGAGGGCTT TC^GGOVGCA TTTCTCAGCC TTTTAGACAC ACACTCTGTT 
14901 AACCCCCATC CTGTCTCTCT GATAATCTTC TTGTGATCCT CC CACCAG CC 
14951 AAGAATTGGG TTTTATCTGA ACCTTGTATT ATGCAAAGTT TTLI I I IGI I 
15001 I I I I I I I ICA CTCCCAAATA TAATATTGAG AATAGAAAGA AACTCTTTTC 
15051 AACAAATGCT GCTGGAACAG ATGGATTTCC ATACTGGAAA AAAAAAAAAA 
15101 AGAGCAAAAA ACAAACCTAG ACCCCTTCCT CACACTCTAC ACATATGTTT 
15151 ACTTCAGATG GATCACAGGT TTATCCCAGA CTAAAACCTC AAACTAAAAA 
15201 CCATTTGGGG CTGGACAGGG AGCTCACGCC TGTAATCTCA GCACTTTGGG 
15251 AGGCTGAGGC AGGTGGATCA CTTGATCTCA GGACTTTGAG ACGAGCCATG 
15301 ACCAATATGG TGAAATCCTG TCTCTACTAA AAAAATACAA AATTAACCAA 
15351 CTCTGCTGCT GCATCCCTCT AATCCCAGCT ACTTGGGAAG CTGAGACAGG 
15401 AGAAI IGCI I GAACTTGGGA AGCAGAGGTT GCAATGACTC GACATCATGC 
15451 CATTGCACTC CAGCCTAGGC AACAAGAGCA AAACTCTCTC TTGGGCTTGG 
15501 CTGGGGGAAA AGCATTTGGA AGAAAGCATA GAAI I PQGTG GCTTGGAGCT 
15551 AGGCAAAGCT TCCTAGGAGA CAGAAGGOAG TTAACATAAA AGAAAAATTG 
15601 GCAAATATAA TCCTGCCACT CTCI ICI 1 1 I TTCTTTAATT TTTTCGGGAG 
15651 CTAGAGATAG GGCTCTTGCT ATGTTACCCA GGCTGATCTC CAACTCCTGG 
15701 CCTCAAGCGA TCCTCCCACC TAGATCCCTC AAAGTACTGG GATTACAGGC 
15751 CTGAGCGACC GTGCCCTOCC CATTCTTGCC AATCTCTTAT AGCAAATACC 
15801 TCTCCCCTCC GGTGACCTGG ATCTGCTAAC CTCCACCCCT GCCTAGACTG 
15851 TGGAAGGATT GCTGGAAGGG TCTCAGTTGC ACAGACCAGG AAACTGAGGC 
15901 CCACAGAGGC AGGTTGTCCGG TTGTTTGCAA CCTCTCAGCC TCTGCTAACC 
15951 CCAATTCTTC AGAGAGAGCC CTGAAACCCT CTCCTCTGGG CGCCCCCAGG 
16001 TCACTCCCCC AGCCTCAAGG GCTGCCTCTG TTGCAGGAAA GACGACGTCA 
16051 CAGGCTACTT CCCCTCCATG TACCTGCAAA ACTCAGGGCA AGACGTCTCC 
16101 CAGGCCCAAC GCCAGATCAA GCGGGGGGCG CCGCCCCGCA GCTAAGCGGG 
16151 GCTCCCCGGG GCTGGGOQGG GTCGAGCGGG GCGCACCACG GGTTCGCTCT 
16201 GTCTAGGCCA TAGCTTOGCA CTGCCGGGGC GGGGGCTCTC AGCCTGGCAG 
16251 GAG4GGCAGG ACCCTCACGG GGGAAAGGGG CTGGAGGCGC CTGGCCGCGG 
16301 TCTOQQQCTC GCACGGGGGC GG4AGOWVG CGGCGATGCC CGGGGGCTTT 
16351 GGGGATCGGC AGTCEAGGGG GQCTCCCGQG AGAGGGGGAC GACAGACCGA 
16401 AGGCTGGTCA GGGGCGTCGA AAACCGCCCA GGCTCTCCTC CAGGGCAAGG 
16451 GTCCTTCTCG TGAOGGGGGC AGCCGCCTCT TGTCCCGCGG GGGTCGTCCA 
16501 GACTACCGGC CCCCTACTCC CCCCCACTTC CTCGG4CCAG GGGTGCCCAT 
16551 CTGAGTCCCT GGGGGCAGGG GCGCCCTCGG GCTTTGACGA CGCCCCGTCC 
16601 CGCTCGGCCA GGTCGTCCAT CCGCAACGCG CACAGCATCC ACCAGCGGTC 
16651 GCGG4AGCGC CTCAGCCAGG ACGCCTATCG CCGCAACAGC GTCCGTTTTC 
16701 TGCAGCAGCG ACGCCGCCAG GCGCGGCGGG GACCGCAGAG CCCCGGGAGC 
16751 CCGCTOGGTC AGTGCAGCGG GAG^GGGCAG GAAGGGCAAG CCCTAGGGGC 
16801 GGAGTCAGCG GGAGAGGCGG GGCCAGAGGC AGGGCCAGAG TAGCGGGGCG 
16851 GGACCAGAGG GCGGAATCAG AGGGAGAGGC GGGGACTGGA GGCGGGGGCA 
16901 GAGGAGGAGC CAGOGCTAGG GGGCGGAGCG ATCCCTAAGA GGCGGAGTCA 
16951 GAGGGAGAGG CACAAGGGGG AGGCGAGGCC AGAGCGCGGA GCAGGAGTPG 
17001 GAGACOGCGG GGGGGCGAGG CCAGAGAGCG CTGTGGGCGG GGCCAGTGTC 
17051 CGGGGCGGGG GGrCTCACTC GGCCCCGCTC TCPGCCCGCA GAGGAGGAGC 
17101 GGCAGACGCA GCGCTCTAAA CCGCAGCCGG CGOTGCCCCC GCGGCCGAGC 
17151 GCCGACCTCA TCCTCAACCG CTGCAGCGAG AGCACCAAGC GGAAGCTGGC 
17201 GTCTCCCGTC TGAGGCTCGA GCGCAGTCCC CAGCTAGCGT CTCGGCCCTT 
17251 GCCGCCCCCT GCCTUTATAT ACGTGTTCTA TAGAGCCTGG CGTCTGGACG 
17301 CCG4GGGCAG CCCGGACCCC TGTCCAGCGC GGCTCCCGCC ACCCTCAATA 
17351 AATGTTCCTT GGAGTGGACC GAGGGTCTGC AGGAATCCAG GGAGGGCCGG 
17401 GCTCCGCCCC AGGGTTATTT TCTAAGTTGA GGACAGGGAG GTTGTCAGTT 
17451 CTGNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17501 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17551 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17601 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17651 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17701 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17751 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17801 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17851 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17901 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
17951 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
18001 NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN NNNNNNNNNN 
18051 NNNNNNNNNN NNNNNNTAAA AATTAGCTCG GCGTCGPGGC ATGCATCCAC 
18101 AATCCCAGCT ACTGGGGAGG CTCAGGCATG AGAATCGCTT GAACCGGGGA 
18151 GGCAGATCTT GCAGTGAGCC GAGACGGCGC CACTGCACTC CAGCCTGGAC 
18201 TACAG4GCGA GACTCTATCT CAAAAAAAAA AAAAAAAAAA AAGTAACTTA 
18251 GGTGONGGGT GTCCTCTCTT ATTCACTGAG ACCGTOCCCC GGTTATGAGG 
18301 TTGTACCAGA AAGCAAGTAT TCACTATGCA CACTATTCAC CGCTCACCCT 
18351 AGCATPGAAG CCAGCCTCTA GCCTGAAAGC CnTGCTTTC AGGGCAGGTC 
18401 TTTCCCCAAA ATGCAGACAC GAAGGXGOVA AGTCAAGCTC CCAGTCTTGC 
18451 AAAAGATCTA ACTTCTCACG AAGGCCACGA GTGGCAGGG^ GAGCTGTCCC 
18501 ACATTTGGGG AAGTOGCTAT GTGAGGAGGG GGGAGGCGGG TCCCTTAGAG 
18551 ATAAGAGACA ATCATAAGGG GAGATATGAG AGAAAATCGT AAGGGGAGCA 
18601 GATGGTTCTC AAGAGAATAG GCTGACCATC GAAGGACTGG CAGAAGCTTT 
18651 CAGAAAACCA CTGGACGGCT GGGGXCAGTG GCTTAGGCCT GTAATCCCAG 
18701 CACTTPGGGA GGCTCAGGCA GGTGAATCAC TTGAGGTCAG GAGTTCCAGA 
18751 CCAGCCTGGC CAACATGGTG AAACCCCATC TCTACAGAAA ATATAAAAAT 
18801 TAGCCAGGCG TCGTCGCACA AGCCTAG4AT CCCAGCTACT TGGGAGGCTG 
18851 AGG 
(SEQ ID NO: 3) 

context: 
DNA 

Position 

139 TACTAAAMTACAAAATTAGCCA^^ 
GAAGCTT^GCXaGGAGAATCG^^ 
GTCCCACTGCACTCCAGC 
[T,C] 

TCGGC^ACMG^GCGAMCrrCGCrniAM 

TTTCATCTGCOVGA^^ 

CCTGGACTCAGGCMGGGTG^^ 

GO^GCK^CTCGGGCTX^GGCCCGCC^ 

AAGCTCTGG^CTCCCAGGCrCTAG 

262 TACTAAAMTACAAMTTACXICAGGG^^ 
GAAGCTGAGGCAGGAGAATCGC^^ 
GTGCCACTGCACTCCAC^Cr^^ 

AGGCCCAGCA IG1CI I GGCTTTCATCTlGCCAG^CCrCAACCCTCACCCCCAGG^G^T^ 

GTCCGGACCATCAGCPGACCC 

[T,C] 

GGACTCAGGCAAGGGrG^GTTGGPQCAGCCCrQGCCTX^^ 

GGCTGCCTCGGGCTGA^^ 

CTCTCGGCCTCCCAGCCTOA 

ATTCTGCCCCCTCCTCACCTCriTCTCCCAM 

MTATCAGTCTCGAGAAAAC^^ 

43 TACTAAAAATACAAMTTAGCCAGGCGrGGrGGCGG\CACCr 
[A,G] 

TMTCCCAGCXACTTGGGM^ 

7GCAGTC4GCay\GATTGT^^ 

TTCAMCAMTAMTTMCGCCG\GCATGTClTGGCriTCA 

CACCCCCAGGAGATCACCTCC^ 

GGTG^GCCCTCGCCrGCTGGGAGGCACAGGCrGGVQ 

344 TMTCC^GCTACTTGGGAAGCT^ 

TGCAGTC^GCCGAGATTGTGCCACTGCACT 

TTOW\GWVTAMTTAACGCCCAQCATGrCTTa 

CACCCCCAGG^TCAGGTC^^ 

GGTGCAGCCCK^CTGCTGGG^GGCACAGGCrGCAGCAGGCT 
[G,A] 

CCACTCATCAACTCATT^CC^^ 

GGGGACTGGGAGAGAGAGGCCTCAGCOOT^ 

TTCJTCCCAAATCCCa^^ 

AGTCAGCCCTAAGGAAGAATTCM^^ 

CTG\CTGW\GrCTXn"GCG^TTTTCGTCCCAGCrGrCA 

721 AGGCCrCAGCCTGTCCCTGGGGVreCTXXCCCCTCCTCACCT 
TCCT(&CAAA<XTG^CACTCTTMT 

CAATTCAATGAACCA I I PGG I AC^1^GG^TTGG^CT<^CTC^(^CTC^AACTCTG^ 

GCCATTTTCGTICCCAGCTGTCACTGGCCCTCATCCAGACACACCO^ 

AGCKTTX^TGCACACrCCCATGCCCGCGlT 

[A,G] 

TTTCATTCACTGVTTCATTCATTCAC^ 

CGATCCAMTATTTATGGCCTCrGTGTGCCAGGGOAGATGG^ 

CCCCTGATMCCCGGTCATGCCCTAGCITTCCTGGGACACACAT^ 

TAAAAAAATTMGPCAGGCC^GGCAOGGTGGCrCATGCCT 

(m^GGCGAGTC^TTACCTC^^ 

1038 TTOXTTCACTCACTCATTCAT^ 

GGCCTCTGTGTGCCAGGG\CTAGATGGAGGGGCrGGGGC^^ 

CATGCCCTAGCTTTCCTGQGACACACATTGTGGTMGGG^ 

GGCOXGGCACGGTlQGCTCATGCCrG^ 

TACCTG^GGTCAGG^GrrCAAC^CCAG^ 

[A,-] 
AAAAAAAAAAAAAMTTAGCCAGtrrcrraGTC 

AGACTAACGCAAGAGAA I IGU I CAACCCAGG^GGCAG^GGTTGCGGreAGCCG^G^TOG 

CGG^TTG(^CTCO\GCCT^^ 

GGC^GGCAG^GGCAGG^GGATCACTAGAGGC^ 

AGCAGQ\CCCTUra'G1 ACAAAAAMTTAAAAAAMTTTMCCQQGCATQ7r^GG\G\G\ 

ACTCO\GCCTCGGW\(^ 

AGGCAGGAGGATCACTAGAGKQ^ 

CTUTCreTACAAAAAAATTAAAAAAAATTTM^^ 

CCAGCrACTCC^C^GQCrcAGQCAQGAQGATCGCKiGA 

TX^OTn^TCCCACCACTGCAC^^ 

[A,G,T] 

MC^TAGCMTMTMTAAAOWWV™ 
CCTCTCCAGMAAAGG^GGG^GGAAG^GGCT 
TG^GmTCAAGGAAGGCllOX:^^ 
GmTCG^GGOW^TCGnTC^^ 

GQGG^G^TCCTCAAGGQG^CGCAGOW^TGCAMGGCCCTG^ 
T<*GAG*GTGGCCCTCC^ 

GCCTOTAGrcrcAGrrAGAGrcrrATxicrcr^ 

ATAGAGTCCTCCCATCTCT^^ 
CttCTGGGrCCCCTMTC^^ 

CTTCnATQGMCATTTAGrCCrATGTO^ I I 1 1 I IGI IAI I 1 1 I I IGT 

[G,T] 

TTT7TG*G*CAG^GTmWOT^^ 

ACTGCAACCrCCGCCTTCCAGGTTCAAC PQG1 I CTCCTGCCTO^CTCCAQ\GTAGTTC 
GG^JTACAGGTGCCCACCACCACGCCrGGCTAA I 1 1 I I G I ATTTTTAGTAGAGACAGGGr 
TTttCCATCnGGCO^^ 

CCTCCCAAAGPGCTCX^^ I 1 1 I I I IGAAAT 

(^TCTCTCCCCAAGC^^^ 
CTMTCCCATAGG^AGGCCAGGCAGC^ 

ATTTACTCCTATCTCCCAAGACL I I ICICI I I I I IGI IAI I I I I I I GTG1TTTG4GACAG 

AGTCTTGATCIGI I C^CO^GGCCAGAGTGCAOTa^CG^TCTCAGCTCACTGCMCCT^ 

CGCCTTC^GGTTCAAC^^ 

[G,A] 

CCCACCACCACGCCTGGCTAAI 1 1 1 IGTAI I 1 1 I AGTAG^G^CAGGGTTTCACCATGTTC 

GCCAC^CTGGTTCTCV^^ 

CTCGG^TTAOVGGCATCACK^^ 

GCAGTGGTGCG^TC^TAGTTG^CTGCAGCCTCMCCTCCG\G 
CCTCAGCCCCTrGAGTAGCTGQQGCrAG^GGCGCACACCACCA^ I I I I I A 

CAMGTGCTCGGATTACAGGCATGAGCC^ 

CTGGAGrGCAGTGGT(£GAT^^ 

CCTCCTGCCTCAGCCCCrrGAGTACXTGGGGCrACAG^ 

GTTTTTAAAATnTTXJK^ 

CTCGGCTTMGCAACCCTCrGGTCTC^CT^ 

[G,A,T] 

ACCGrGCCTAGTCACTTTTCTCC I M ICI I I GTAACTTTCAG I I I I OAAATTTCAAATTT 

ACAGAMGGCrACTX5GGTOTCAAAACGGTACCAGTCACTCCAATAGTCr^ 

TCATCCACACCTCPCITTCrGGGGATATT^ 

TGI I ILI I I ACCTCTAAATACTAG I IGI IGGGCAI I ICI IAAAATCAAGGCATTCTCTTA 
CATMT^CAAG^CACGTCJTCAAA^ 

CATTATCCACCCACAG^CTTTA^ 

G4AAACI I 1 1 I I CAGGTCTAGGATCCAGTCMGGATCMTCrCATAGCCTTTM 
TTMTCTGGATCAGTLI II I I ICI I I I ICI I I I ILI I 1 1 I I I GGACACGC^ATCTCACTC 
TCTCGCCAG^CTGGAGTGCAGTGCT^ 
TTCAAGAGATTrTCOT^^ 

[C,T] 

GCCCAGCTCGI I 1 1 1 GGTAG^GACAGGGTTTTGCG^TT^ 1 1 I I 1 1 1 

TTTTTTTTATr^G^TCG^C™ 

ccAcrcAcra^TccrccGCCTCcovGGrrcM 

TAGCTGGGATTACAGGCAKKiGCCACCATGCCCGGCrAC 1 1 I I I GTATTTTTAGTAG4G* 
CAGGGITT^CCATCTTAGCCAGGCTC 



FIGURE 31 



3906 CC^CAG^CTTTAa^GGTTTCCCC^TTATCL RQG ICjI CCTCTGCAGTG^AAACnTT 
TTCAGGTCTAGG^TCCAGTCAAQG^TCAATCT^ 

TCAGTCI I I I I IU I I I ICI 1 1 1 ICI 1 I I I I I GG^ACGG^ATCrCACTCTCTCGCCAQ\ 

CTGG^CTGCACTGGTCC^^ 

TUTCCTTECTCAGC^ 

[-,C,G]

TTTTTTGGTAGAG^CAGG^^ I I 1 1 I I 1 1 I I I 1 1 1 IAT 

G^GWOG^GTCTTACTCTUrCACCCAGGCT^^ 
CATCCTCCGCCTCCCAGGTT^ 

TACACX^TCOGCCACCATCCCCGGCTAC I i I I I GTATTTTTAGTAGAGACAGGGTTTCA 
CCATGTTAGCCAGGCTC^TCrCG^CTCCTG^ 

3907 G^G^mTAOX^GGrrrCCCCGATW 1 1 1 1 
Ta\QGTCTAQG^TCO^GTCAAQGATCAAT^ 

CAGTCI 1 1 I I ICI 1 1 I ICI I I I ICI 1 1 1 1 I I GGACACQGMJCTCACTCTUrCQCCAGAC 

TCGAGT^CAGTCGTXSCAATCrcQa 

OttTGCCTONGCOX:^^ 

[-,T] 

TTTTGCTAGaG^G^GGGTTTTGCCATTOXTTCr I I I I I I I I I I I I I I IATC 

AG^TGGACTCTTACTCTCTCACCCAGG^ 
ATCCTCCGCCTCG(^GGTTGV\GCAATTCrCOTGCCTCAGCCrC^^ 
ACAGGCATGCGCG\CCATCCCCGGCTACI 1 1 I ICTAI 1 1 I I AGTAG^G^CAGGGTTTCAC 
(^TGlTAGCCAGGCra 

3911 G^CTTTACTCAGGnrTTCCC^^ I MM ICAG 

GTCTAGG^TCCAGTTCAAQG^TCAATGrCATAGCCTTTAACC I I CM lAATCTGG^TG^GT 
CI I 1 1 1 ICI I ;l I ICI II I ICI I 1 1 I I I GQ\CACGG^ATCTCACrCTCTCGCCAG^CTGG^ 
GK^GTGGPGG^ATCrCGGCTCATTXKAACCT 
TGCCTCAGCCTCCTC^GTAGCTC^^ 

[-,T] 

GCTAG4GACAGGU I 1 1 GCCATTG^TTCTGQ\TCAGTC I M I I M M 1 1 M I IATQ\Q\T 

GG^CTCmCTCTGTCAC^^ 

TCCGCCrCCCAGCTTCAAGCAATTCTCGTGCCTCAGC 

GCATGCGCCACCATGCCCGGCTACI I I I IGTAI I 1 1 I AGTAG^G^CAGGGTTTCACCATC 
TTAGCCAGGCTGATCXCGAACTCCrc^^ 

3932 ATTATCCTGCTTGTXZCTCrcCAGTXiAAAAC I 1 1 I 1 1 CAGGT CTAGG^TCCACTCAAGG^T 

CAATCTCATAGCCTTTAACC I ICI I I MTCTGGATCAGTC I I II I ICI II I ICI 1 1 I ICT 
TTTTTTO£ACACG(^T 

TCATrGCAACCTOaiCrCCTGGGTTC^AG^G^TTCrCCTGCCT 
TQQGAATACAQOTQCGCGCCACCACGCCCAGCrCGI I I I I GGTAGAGACAGGGTnTGCC 
[-,A]

TTCMTCTGG^TCAGTC I I 1 1 I II I 1 1 I I I 1 1 ATG\G^TGG^GTCTTACrCTGTCACCCA 

GGCTGGAGTGG^ATCGCACMTCTCC^CrCACTGG^TCCTCCGCCT 

ATTCTCCT0CCTCAGCCTCCCG4CT 

TACTTTTTGTATTTTTAGTAG^G^C^GQ 

TCCTX^CCTT^VGCTWC^ 

3934 TATCCICCI I CTCCTCTGCACTG^AMC I II I I I CAGCTCTAGG^TCCAGTCAAGGATCA 

ATCTTCATAGCCTTTMCC I ICI I I AATCTOGATCAGTC I I I I I ICI 1 1 I ICI I I I ICI I I 
TTTTX^CACGGWCTCACTCT 
ATTGC^CCTCTCCCTCCTOGGTTC 

GGAATACAGGTCCGCGCCACCACGCCCAGCTCG 1 1 I I IGGTAGAG^CAGGCTTTTGCCAT 
[-,T] 

GATTCPGGATCAG I C I I I 1 1 1 1 1 1 I II I I I ATGAG^TGG^CTCTTACTCTCTCACCCAGG 
CTCG^GTX^ATXXKACAATCTCCACT 

TCrCGTGCCTG\GCCTCCCG^Gr AGCTOGG^TTAC^GQCATGCGCCACCATGCCCGGCT A 

CrTTTTCTATTTTTAOTA 

CT^CCTX^GCnWC^ 

3949 CTCCAGTCAAAACI 1 II I I G^GCTCTACXATCCACTO^ 

ACCI ICI I IAATCTCGATCAGTCI I II I ICI I I I ICM I I ICI I I I I I I (jGACACGGAAT 

CTO\CTCnCTO^CAG^CTCGAGTCCAGTl^ 

TCCTGGGTTCAAGAG^TTCrCCTCCCrCAGCCrCCTGA^^ 
GCCACCACGCCCAGCTCG 1 1 1 1 I GCTAGAGfrCAGGG 1 1 1 I GC CATTWTCT^TCAGT 

[C,T] 

TTTTTnTTTTTTTTATX^ 
ACAATCTCCAOOOTCAT^^ 

TCCCG^GTAGCTCGG^TT^^ 1 1 I I ICTATTTTTA 

GTAG^GAO^QCjrrrCACCATCTTAQ^ 

CTC£CCGCCT<X4C^^ 

3994 ATCrCATAGCCTTTAACC > ICI I I A ATCTCGATCAGT L ICI 1 1 I I C I I I I I C I I I 

TTTTC^CACGC^TCTCAa 
ATTTEAACCTCTGCCr^^ 

GGAATACAGGTOCQCGCCACCACGCOIAQCrcG 1 1 1 I I GGTAGAGACAGGG I I I IGCCAT 
TWTCTGGATCAGTCI 1 1 1 I I I I I 1 1 I I I I ATXy^TCG^CTTACTCTCrCACCCAG 
[A,G] 

TCTCGTC£aX^GCCT^^^ 

CFirTTXnATTTTTAGrAGAGACAQQCJriTCACCATGrr ^GCCAGGCTGATCTCGAACTC 
CTCttCGTCAGGTWaT*^^^ 

CACCGTX^CAGCGG^TTCTGG^TCGGTCTTMTG^GTCrTTOT 

6272 AMCTAAMCAGACAGC^TCTCCC^^ 

TQWWXOWVGCCAAGTTCTAG^TCCCAMTAAATGC^ 
TGTGGrrCTCGrCCCTATGrrAGTrATTTTCCT 

ATGACTTATCAG 1 1 I I 1 1 IATATTTGLI I I ICI 1 1 1 GAGATGGGGrCmQCTCTGTCACC 

CAGGCTGGGGTCCACnX^GCM^ 

[C,T] 

G^TTCTCCCATCT^GCCTCC^ 

CmTCCTGM I K, I I I 1 1 I G I AG^G^TCQC^TTTCGCTATGrrGCCCAGGCTGGTCrCT 
MCTCCTGG^CTCAAGK^T^^ 

CaACAG^CAAGAGTlGG^CGQGQGACACAAGG^GGCCATTT^^ 

6427 ATTTTACATTTCTACC I 1 1 I IAAGAATGAG1TATCAGI I I I I I 1ATAI I IGCI 1 1 ICI I I 

TG^GATGGCCTCTTO 
GCAGCCTG^ACCTCCAGGGCTG^ 

CAGGTGrGCACG^CCACACCTGGCTCCrnTCCTG^ I I lb! I I I 1 1 GTAQVGM1QQGATT 

TCGCTATCTTCCCC^^ 

[T,C] 

TCCOWVTTCCTAGGATTACAGGrniGA^ 

GG^TCCCTTT<^G4Cr^ 

ATTTTCGTTATCCAGGCCTCCTAC^^ 

CT<^G4TCCG^AA<^(AT^^ 

GQGAGQGAGAG^QGTCAGTGCCrCTCTTMTCAA^ 

ACAGACAAGAGrGG^CGQQQGACACAAQGAGGCCATTTTCGrrAT^ 

GCTAGGGC^GGAGGGTOGGGTTGGTIGGG^G^ 

TCCAAGCAAAAGG^TTTCCT^ 

[T,C] 

TMT<^GGAATCO\GATf^^ 

CACTCTGTCACCCAGG^TGG^GrGCAGrGGCGCCATCTCGGCT 
CCCAGGITTAAACG^TTCTCCCACCT^ 

CCACCACTCCCGGCTAAI I I I ICTAI I 1 I lACTAGAGACGGGGnTCACCACGTTGGCCA 
QQCTGGrcrrGAACTCCKaACCrCAAGniATCCAC 

7741 COT^TT^CCTO^ 

GGAGAQGG^AGGAGGGGTIGQG^CI I ICIGIGI 1 1 I CCAQ\TX^GG^AACCAAGGCrCAG^ 

QVQQGAAAGCCACCrrCCO^iAGCCACAO 

GCCCCTTniGACCCGGC^^ 

GCACrCGGGGCn^GGTTC^^ 

[C,T] 

ATO^CCTCjGAGCCACTCACCACTCCT 

GCAGTX^QG^TCAGATWAQCATTXaQGIGICI l(aCX5G^CCCCC^GAAGGAlXnX5QQOT 
TGMlGCCrCTGCTAAOTGCI GAGCA IGICI GQQGKTCCrcrACCCAQGACCCTCTGTQG 
AAGXACCTC^G^^ 



8294 GMGCTCGAAGTTm^GCATT^^ 

C^G^CAG^G^CA^GCACK^TKOXICGGCCAGCAC^^ 

OGGCCCMGG^GG^ 

GTCGGGTCCGCTCCrCATTCC^ 

ACAGTOGCTCACACCTATMTCCCAGC^^ 

[A,G] 

GrreGG^Grniy\G^C<^GCCTGGCCMCATO^ 

AAAATTTAGTCAGGCATGGT^^ 

AGG4GMTCGCnT^CCC^^ 

AGATCGCCTXaQG^GCCCGCAAG^GCATTTTCCAGQC^ 

9313 TCGGCQSGGn^GGnXCT 

G*T<^ClTG4GCTtAGG4GTT^ 

CTAAAAATCCAAATTAGCCAGGC^^ 

GCTGAGQCAQG^GAATTGCTTGAACCCQGAAG^ 

CCACTCCACTCCAQCCTGGGO^GCAG^GCCAG^CTC 

[A,-,G] 

AAGWTGGGTUTTTCG^AGC^^ 
CAGAGACCA(£GCAG*CCCTra 
G^CAGGTTTTCGAGGGCAGCCTX^ 
ATACTTATTO^CAGrCACT^ 

GIGU lACTCTUTQCCAGCCACTAI I I IGI 1 1 I IGI IGI I I I CAGTQXCAGGGTCTCGCT 

10838 CnnACTCTAGCCTGGGC^CAGAGCMGACrCTGrCTCCA 
C^MGGCTCACrMCrrCATCAG^TOVG^ACAA 
[G,A] 

G^CATGTTTGAAGTlCTGAGGCCG^GCCrGGAGAACa^^ 
CAGCAGAGACKA^GATGCCAGCGCCPCTTCTG 

11093 AMTAAAGTTGGGAAAGGCTCACTMCTT^ 

GATGCCAGCGCCTGTTCTCGAGGCC^ 
CCOXTTGCCG^GACAT^^ 
CTACGAGAAG^CCTCGGGCrCCGAG^TCGCrCTGrCG^ 
[G,A] 

GAG^AGAGCGAGAGCGGTCAGACCTCC^CCTTAGGGGGCrc 

MCCCAC^GCG^GVVGCCC^^ 

GCTCTGTTAGGGGCCCTAAATXnXICrcCCCACACr^ 

CACCCTGTCCTrcKSCrGTGGGCATC^ 

TTCTGTCTGG^TGGGTATGGGACCGrCTGrrCATTATG^ACT 

11195 AGGGCAGCAGAGACTCMG^TGCCAGCGCCTGrrCTGG^GGC 

GTACCGCGCCATTCCCG4CT 
GGACGTGGTTGGAGGTCCTCX^^^ 
CCrrCCCCTGGTGCTCAGG^^ 
[T,G] 

GCCCCTCGGAGGACTCCAGCrcrcrrAGGGGCCCTAMTOT^ 

GCcmrrcraTAGrGTCCACccrGR^^ 

GCGGGGCATGTCTGCGTCT^^ 

GGGCTC^GAGCrGTX^TTCTCTGAGCATCreT^^ 

GrcTOGTXiMGCrPG^CATTTCCAAATCT^ 

11213 GATGCCAGCGCCTCTTCTCC^GGCC^ 
CCCTCTTGCCCCAG^T^^ 
CTACG^G^AG^CCTCGOGCT^^^ 

GG^G^V\G^GCG^GAGCGGTCAG^CCTCCCACCTTACGGGGCTC 

GMCCCACAGC<^CAAGCCCC^^ 

[G,A] 

GCTCTGTTAGGGGCCCTAM 
CACCOUKKJreGCTGTCGGCATC^^ 



FIGURE 3L 



TTOXJTOT^TCQOTATCGG^a 

OTJTT^CKATCTUrcCAT^ 

TTCCAAATCTX^GCATTGG^CAT(^CT 

11263 ACCCTCCCCTCCl I I I I C^CCCAG^(^TCACCQGCCCCAT(^TCCTGCAG^CGrACC^ 
CCATTCCCG^CTACG*^ 
TOttGGTCGTGGAGAAGAGCGAGAGC^^ 
TQOTQCTiCAGG^CCCACAGCCACAAGCCCCCK^ 

GGAGGACTCCAGCICIGI I AGGGGCCCTAMTCTCCTCCCCAGXCrGTGQGTCGCCTTCr 
[C,G] 

TCTTAGTXnXXACCCrcrGG-| QQCTXnTXK^TCTCTQCATQQCAGGCCQGQGCQQQQGV 

TCTCTCCGTGlTCTGTCrc^ 

AGCTCTWTOUTGAGC^^ 

AAGGTT^<^TTTCOWVTT^^ 

CCATCCOX^TCGCTGCAQQGAGCCQCroC^ 

13707 GQGGTCTTTAGGG^TCTIGGGG^ 

TXTGGGCnAGTTTCCATCrCAGTAGCA^ 

GGGGATCCMTCTCCTTTnCCAAGTCX^ 

CG^CCATCTATTATTATTCCTTCTCTCGG\ACA 

ATTCCTAGCTAGGC3GCMTGGCCCAGGCCTCT AATCCCAGG^CTTTGQGAGCCCAGGAGV 
[G,A] 

GACG*TCACGTT2\(CTCAGGAG1T^^ 
CTACTAAAMCftCAAACATGAGCCGGGlTJrrG^ 
GGGAGTCTGAG^CMG^G^TC^CXrcAACCCGGGAGGCGG^ 
TCGCGCGATTCXACTCCAGG^^ 

AAAAAAAAAGATTA U 1 11 1 I M I ATCATTCCTTTATCXrTTAAAGl I 1 1 11 IGCAGTCA 

14629 TGTlTATCCrCCAMTCAATGG^GMATACrMTTAT L I 1 I U I CT GGTTCTGGGGAACA 
CAGAATTOAGCOXTTXJTiGGAGCCAT^ 

CCTUrcrCTTCAl I I 1 1 IACGAAI I I I I ICAI I I 1 1 I GAGACAGGATCTnQCTCPGACTXI 
CCAGGCTQG^GCAG\AT(^TCGCr(^CTCMGCG^TCCT 

GCTOGC^CTACAGGrGAGCACCACCACATCrGGCTAA I G I I I I I IAAI I I I 1 1 IGTAGGG 
[G,A] 

TGGGGTCTG^CTATT^GCCMGACTAGTCTTAAACT 

CCITGGCCTCCOWV^ 

CGTCTTCAOAGCCCTAGGrCAC^^ 

ATTTCTG\GCC1TTTAG^ 

14698 TTCAI I I 1 1 IACGAAI I I I I ICAI I I I I I G^GACAGGATCTTGCTCrGACrCCCAGGCTG 
GAC^CAATC^TCGCXCACrCA^^ 

TACAGGTXy\GG\CO\CCACATCTGGCTAATG I I I 1 1 IAAI I I I I I I GTAGGGGTGGGGTC 
TCACTATGGTGCCAAGACTAGrcrTAMCTC^ 

[T,A] 

ccqw\gcactcggatta(^ggaatgagcct^ 
agccctaggtcacagggccagcctggcgccctgcoqcaagcitatc^ 
agv^tgcatacox^gtc^ 
ccttttagacacacacrcrgttaacccccatcctgtc^ 

C 

16095 MTACcrxnxiccara^ 

AGGATTGCTGGAAGGGTCTCAGTTGCACAGACCAGGAM 

GTCCGGI lb I I IC^CCT(T(^GCCTGr<CTMCCCC^TTGrTCAGAC^GAGCCCrGA 

MCCCrCTCCnCTCGGCGCCCCCAGGTC^^ 

AGGAAAGACGACGTCftCAGGCrACTTCCCGrCCA^ 

[C,T,G] 

TOTCCCAGGCCCAACGCCAGATCAAGCGGGGGGCGCCGCCCCGCAGGTAAGCGGGG 

CGGGGGCTGGGCGGGGTCGAGCGGGGCGCACCACGGGrrCGCrcrGTCTAGG^ 

TCGCAGTGCCGGGGCGjGGGCTOC^ 

agqggctcgacgcgcctggccgcggtutogggcto 
atgcccgggggctttcgggatgggcagtccaggggggctccccg 

16266 gagccctowkcctct^^ 



FIGURE 3M 



CTCTGTTCCAQG^AAG^CG^CGrCACAGGCTACrTC 

GQGCMG^CGTCTCCGVGGCCCMCGCCAG^TCAAGCGQQGGGCGCCGCCCC 

GGGGGGGTCCCCGGGGCKX^GGGGTT^^ 

GGC(*TAGCTTCGCAGrGCCG^ 

[C,T] 

AAAGCGCtG^TGCCCGGGGGCXnGG*^ 

GGACGACAG^CCG^AGGC I GG I CAQGGGCGnPGGAAAACCQCCCAGGCT C I GC I GCAGGGC 

CC^CCCCTACTGCC^^ 

16629 AGCGGCGATCCCCGGGGGCTITQGGGCTQG^ 

AOGACAGACCGAAGGCTGGPCj^GGGGCGTG(^AAACCGCCC^GGCTOT^ igcagggcaa 

GGOTCCrrUrCGn^CGGGGGCAGCCGCCTC I I G I CCCGCCGGGGTCGPGCAG^CrACGG 

GCCCCCTACTGCCCC^^ 

GGGCC^CCTCGGGCTTT^^^ 

[C,T]

GGGAOVGGCGGG^^ 

GAGGGAGAGGCGGGGACliGGAGGCGGGGGCAGAGG^GG^GCCAGCGCT 

16642 GGGGGCTTTCGGGWGGGCAGrCCAGGGGGGCTC^^ 
GGCTCXTTG^GGGCKjCTGG 

G^CGGGGGCAGCCGCCTCTTGrCCCGCCGGGGTCGTGC^CTACGGGCCCCCT^ 

CCCCACTTCCTCGGACCAGGGGTGCC^ 

Crrre^CGACGCCCCGrCCCGCTGGGCCAGCTCGT^ 

[C,T] 

CAGCGCTCGCGGMGCGCC^^ 1 1 I ICIG 

CAGCAGCG^CGCCGC^GGCGCGGCCGGG^CCGCAGAGCCCCGGGAGCCCGCTCGC^ 

TQCAGCGGGAGAGGGCAGGAAGGGCAAGCCCTAGGGGC^ 

CG\GAGGCAGGGCCAG^GTAGCGGGGCGGG^CCAG^GGGCGGAATG\G^G 

GGaCTCG^GGCGGGGGCAG^CK^GGAGCCAGCGCTAGGG 

18537 AAAAMGTMCTTAGGTGCAGGGrGrC^ 

GAGGnCTACCAGAAAGCAAGTATTCACTATGCA^ 

G^GCCAGCCTUTAGCCTGAMGCCrTTGC^ 

A(^CG^GGTGCAMGrG^AGCrGCO^GTCrrGCAAAAGATGTM 

ACG^GTGGCAGGGAG^GCTGTCCCAC^TTTGCGG^AGTO 

[C,T] 

GGGTCCCTTAG^G^TAAG^GAC^ 

GCAG^TGGrPOTCAAGAG^ATAGGCTGACCATCQW^ 

CCACTGG^CGGCTGGGCACAGr^^ 

GCAGGTGAATCACTTG^GGTCAGGAGTTC^ 

ATCTCTAC^GAAMTATAAAMTT^ 

18589 CCX5GrrTAT^GGTTGTACCAQW\GCAAOT 
CTAGCATTCMGCCAGC^^ 
AAATGCAG*(^GAAGCTGC^^ 
CG4AGGC<^CG4GTGGCAGGG4G4^ 
GGGGGAGGCGGOTCCCTTAGAGATAAGAGACAATCATM 
[G,A] 

TAAGGGGACXAGATGGrrcrCAAGAGAATAGGaX^CC^^^ 

TCAG^AMCCACTGGACGGCTCGGCACAGTTXKTrAGGCCTOT 

AGGCTG^CGCAGGTGMTCACTTG^GGTCAGG^Gl^ 

GAAACCCCATCTCT ^CAGAAMTATAAAMTTAGCCAGGCGTGGTGGCAC^GCCrAQ\A 
TCCCAGCTACrnGGG^GGCTG^GG 

18720 CG^GGTCCAAAGTGAAGCTK^ 

AGTGGO\GGG4G4GCTGTCCCACA^^ 

GTCCCTT AGAG^TMGAG^CAATCATMGGGG^GATATCAGAGAAAATCGTMGGGG^GC 

AGCTCCTTGTCAAG^GWAG^ 

ACTGGACOCCTTGGGCACAGTCGCTTAGGC^ 

[G,A] 
GjTT^TCACTTCAGCT 

TCTAO\G*AMTATAAA^ 

TGGGAGGCTCAGG 

18782 TCGCAGGC^G^GC^^ 

CCCXTAGAG^TMG^C^TW^ 
ATCGnOTG^AGAG^TAQQCTC^ 

[C,T] 

TACaGAAAATATAAAMTTAG^^ 
GG^QQCTCAGG 

18841 TCCGTAGAGATAAGAG^CAATCATAAGQQG^GATATONG^ 
GWGGTTCT^GAG^ 
CTQGACQQCrcQGCAO^QQCTT^^ 
GGTCAATCACTTCAGCT^^ 
TCTAOKAAAATATAAAAATTAG^ 
[C,T] 

GGGAGGCTGAGG 

Chromosome mapping 
Chromosome 19 